Comparing effect sizes¶
Import packages¶
In [1]:
Copied!
import sys
sys.path.insert(0,"/home/yunye/work/gwaslab/src")
import gwaslab as gl
import pandas as pd
import sys
sys.path.insert(0,"/home/yunye/work/gwaslab/src")
import gwaslab as gl
import pandas as pd
Download sample sumstats¶
In [8]:
Copied!
!wget -O bbj_bmi_male.txt.gz http://jenger.riken.jp/2analysisresult_qtl_download/
!wget -O bbj_bmi_female.txt.gz http://jenger.riken.jp/4analysisresult_qtl_download/
!wget -O bbj_bmi_male.txt.gz http://jenger.riken.jp/2analysisresult_qtl_download/
!wget -O bbj_bmi_female.txt.gz http://jenger.riken.jp/4analysisresult_qtl_download/
--2023-02-05 20:54:37-- http://jenger.riken.jp/2analysisresult_qtl_download/ Resolving jenger.riken.jp (jenger.riken.jp)... 134.160.84.25 Connecting to jenger.riken.jp (jenger.riken.jp)|134.160.84.25|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 136657659 (130M) [text/plain] Saving to: ‘bbj_bmi_male.txt.gz’ bbj_bmi_male.txt.gz 100%[===================>] 130.33M 11.1MB/s in 12s 2023-02-05 20:54:49 (10.9 MB/s) - ‘bbj_bmi_male.txt.gz’ saved [136657659/136657659] --2023-02-05 20:54:49-- http://jenger.riken.jp/4analysisresult_qtl_download/ Resolving jenger.riken.jp (jenger.riken.jp)... 134.160.84.25 Connecting to jenger.riken.jp (jenger.riken.jp)|134.160.84.25|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 136645377 (130M) [text/plain] Saving to: ‘bbj_bmi_female.txt.gz’ bbj_bmi_female.txt. 100%[===================>] 130.31M 10.2MB/s in 12s 2023-02-05 20:55:01 (10.7 MB/s) - ‘bbj_bmi_female.txt.gz’ saved [136645377/136645377]
Creating comparison plot using beta and se¶
use gl.Sumstats object¶
In [2]:
Copied!
gl1 = gl.Sumstats("bbj_bmi_female.txt.gz",fmt="gwaslab",snpid="SNP",ea="REF",nea="ALT",sep="\t")
gl2 = gl.Sumstats("bbj_bmi_male.txt.gz",fmt="gwaslab",snpid="SNP",ea="REF",nea="ALT",sep="\t")
# auto extract lead SNPs and compare the effect sizes
a = gl.compare_effect(path1= gl1,
path2= gl2,
)
gl1 = gl.Sumstats("bbj_bmi_female.txt.gz",fmt="gwaslab",snpid="SNP",ea="REF",nea="ALT",sep="\t")
gl2 = gl.Sumstats("bbj_bmi_male.txt.gz",fmt="gwaslab",snpid="SNP",ea="REF",nea="ALT",sep="\t")
# auto extract lead SNPs and compare the effect sizes
a = gl.compare_effect(path1= gl1,
path2= gl2,
)
Fri Jun 23 00:44:28 2023 GWASLab version 3.4.15 https://cloufield.github.io/gwaslab/ Fri Jun 23 00:44:28 2023 (C) 2022-2023, Yunye He, Kamatani Lab, MIT License, gwaslab@gmail.com Fri Jun 23 00:44:28 2023 Start to load format from formatbook.... Fri Jun 23 00:44:28 2023 -gwaslab format meta info: Fri Jun 23 00:44:28 2023 - format_name : gwaslab Fri Jun 23 00:44:28 2023 - format_source : https://cloufield.github.io/gwaslab/ Fri Jun 23 00:44:28 2023 - format_version : 20220729_v3 Fri Jun 23 00:44:28 2023 Start to initiate from file :bbj_bmi_female.txt.gz Fri Jun 23 00:44:34 2023 -Reading columns : CHR,BETA,ALT,POS,SNP,P,REF,SE Fri Jun 23 00:44:34 2023 -Renaming columns to : CHR,BETA,NEA,POS,SNPID,P,EA,SE Fri Jun 23 00:44:34 2023 -Current Dataframe shape : 5961600 x 8 Fri Jun 23 00:44:34 2023 -Initiating a status column: STATUS ... Fri Jun 23 00:44:36 2023 Start to reorder the columns... Fri Jun 23 00:44:36 2023 -Current Dataframe shape : 5961600 x 9 Fri Jun 23 00:44:36 2023 -Reordering columns to : SNPID,CHR,POS,EA,NEA,BETA,SE,P,STATUS Fri Jun 23 00:44:36 2023 Finished sorting columns successfully! Fri Jun 23 00:44:36 2023 Finished loading data successfully! Fri Jun 23 00:44:36 2023 GWASLab version 3.4.15 https://cloufield.github.io/gwaslab/ Fri Jun 23 00:44:36 2023 (C) 2022-2023, Yunye He, Kamatani Lab, MIT License, gwaslab@gmail.com Fri Jun 23 00:44:36 2023 Start to load format from formatbook.... Fri Jun 23 00:44:36 2023 -gwaslab format meta info: Fri Jun 23 00:44:36 2023 - format_name : gwaslab Fri Jun 23 00:44:36 2023 - format_source : https://cloufield.github.io/gwaslab/ Fri Jun 23 00:44:36 2023 - format_version : 20220729_v3 Fri Jun 23 00:44:36 2023 Start to initiate from file :bbj_bmi_male.txt.gz Fri Jun 23 00:44:42 2023 -Reading columns : CHR,BETA,ALT,POS,SNP,P,REF,SE Fri Jun 23 00:44:42 2023 -Renaming columns to : CHR,BETA,NEA,POS,SNPID,P,EA,SE Fri Jun 23 00:44:42 2023 -Current Dataframe shape : 5961600 x 8 Fri Jun 23 00:44:42 2023 -Initiating a status column: STATUS ... Fri Jun 23 00:44:43 2023 Start to reorder the columns... Fri Jun 23 00:44:43 2023 -Current Dataframe shape : 5961600 x 9 Fri Jun 23 00:44:43 2023 -Reordering columns to : SNPID,CHR,POS,EA,NEA,BETA,SE,P,STATUS Fri Jun 23 00:44:44 2023 Finished sorting columns successfully! Fri Jun 23 00:44:44 2023 Finished loading data successfully!
In [ ]:
Copied!
use pd.DataFrame¶
In [3]:
Copied!
pd1 = pd.read_table("bbj_bmi_female.txt.gz",sep="\t")
pd2 = pd.read_table("bbj_bmi_male.txt.gz",sep="\t")
# cols_name_list should be SNPID, P, Effect Allele, Non-Effect allele, Chromosome and Position
# effect_cols_list should be BETA,SE
a = gl.compare_effect(path1 = pd1,
cols_name_list_1 = ["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1= ["BETA","SE"],
path2 = pd2,
cols_name_list_2 = ["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2= ["BETA","SE"]
)
pd1 = pd.read_table("bbj_bmi_female.txt.gz",sep="\t")
pd2 = pd.read_table("bbj_bmi_male.txt.gz",sep="\t")
# cols_name_list should be SNPID, P, Effect Allele, Non-Effect allele, Chromosome and Position
# effect_cols_list should be BETA,SE
a = gl.compare_effect(path1 = pd1,
cols_name_list_1 = ["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1= ["BETA","SE"],
path2 = pd2,
cols_name_list_2 = ["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2= ["BETA","SE"]
)
use tabular files¶
In [4]:
Copied!
# SNP CHR POS A1 A2 A1Frq Rsq BETA SE P
# gwaslab will automatically extract significant variants from both sumstats.
a = gl.compare_effect(path1="bbj_bmi_female.txt.gz",
cols_name_list_1=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1=["BETA","SE"],
path2="bbj_bmi_male.txt.gz",
cols_name_list_2=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2=["BETA","SE"],
label=["Female","Male","Both","None"],
xylabel_prefix="Per-allele effect size for ",
is_q=False,
sig_level=5e-6,
legend_title=r'$ P < 5 x 10^{-6}$ in:',
verbose=True
)
# SNP CHR POS A1 A2 A1Frq Rsq BETA SE P
# gwaslab will automatically extract significant variants from both sumstats.
a = gl.compare_effect(path1="bbj_bmi_female.txt.gz",
cols_name_list_1=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1=["BETA","SE"],
path2="bbj_bmi_male.txt.gz",
cols_name_list_2=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2=["BETA","SE"],
label=["Female","Male","Both","None"],
xylabel_prefix="Per-allele effect size for ",
is_q=False,
sig_level=5e-6,
legend_title=r'$ P < 5 x 10^{-6}$ in:',
verbose=True
)
Fri Jun 23 00:45:52 2023 Start to process the raw sumstats for plotting... Fri Jun 23 00:45:52 2023 -Loading Male SNP list in memory... Fri Jun 23 00:45:57 2023 -Loading sumstats for Female: SNP,P,CHR,POS Fri Jun 23 00:46:04 2023 -Counting variants available for both datasets: 5961600 variants... Fri Jun 23 00:46:08 2023 -Using only variants available for both datasets... Fri Jun 23 00:46:08 2023 -Extract lead variants from Female... Fri Jun 23 00:46:08 2023 Start to extract lead variants... Fri Jun 23 00:46:08 2023 -Processing 5961600 variants... Fri Jun 23 00:46:08 2023 -Significance threshold : 5e-06 Fri Jun 23 00:46:08 2023 -Sliding window size: 500 kb Fri Jun 23 00:46:09 2023 -Found 3455 significant variants in total... Fri Jun 23 00:46:09 2023 -Identified 66 lead variants! Fri Jun 23 00:46:09 2023 Finished extracting lead variants successfully! Fri Jun 23 00:46:09 2023 -Loading sumstats for Male: SNP,P,CHR,POS Fri Jun 23 00:46:20 2023 -Extract lead snps from Male... Fri Jun 23 00:46:20 2023 Start to extract lead variants... Fri Jun 23 00:46:20 2023 -Processing 5961600 variants... Fri Jun 23 00:46:20 2023 -Significance threshold : 5e-06 Fri Jun 23 00:46:20 2023 -Sliding window size: 500 kb Fri Jun 23 00:46:21 2023 -Found 5375 significant variants in total... Fri Jun 23 00:46:21 2023 -Identified 85 lead variants! Fri Jun 23 00:46:21 2023 Finished extracting lead variants successfully! Fri Jun 23 00:46:21 2023 Merging snps from Female and Male... Fri Jun 23 00:46:21 2023 -Extract statistics of selected variants from Female : SNP,P,REF,ALT,BETA,SE Fri Jun 23 00:46:27 2023 -Merging Female effect information... Fri Jun 23 00:46:30 2023 -Extract statistics of selected variants from Male : SNP,P,REF,ALT,BETA,SE Fri Jun 23 00:46:36 2023 -Merging Male effect information... Fri Jun 23 00:46:38 2023 -Updating missing information for Female ... Fri Jun 23 00:46:45 2023 -Updating missing information for Male ... Fri Jun 23 00:46:52 2023 -Assigning indicator ... Fri Jun 23 00:46:52 2023 -Aligning Male EA with Female EA ... Fri Jun 23 00:46:52 2023 -Aligned all EAs in Male with EAs in Female ... Fri Jun 23 00:46:52 2023 -Saving the merged data to: Female_Male_beta_sig_list_merged.tsv Fri Jun 23 00:46:52 2023 -Identified 0 variants which are not significant in None. Fri Jun 23 00:46:52 2023 -Identified 51 variants which are only significant in Female. Fri Jun 23 00:46:52 2023 -Identified 70 variants which are only significant in Male. Fri Jun 23 00:46:52 2023 -Identified 27 variants which are significant in Both. Fri Jun 23 00:46:52 2023 Creating the scatter plot for effect sizes comparison... Fri Jun 23 00:46:52 2023 -Calculating p values based on given null slope : 0 Fri Jun 23 00:46:52 2023 -Beta_se = 0.054318056481159616 Fri Jun 23 00:46:52 2023 -H0 beta = 0 , default p = 1.91e-26
OR mode¶
In [5]:
Copied!
gl1 = gl.Sumstats("bbj_bmi_female.txt.gz",fmt="gwaslab",snpid="SNP",ea="REF",nea="ALT",sep="\t")
gl2 = gl.Sumstats("bbj_bmi_male.txt.gz",fmt="gwaslab",snpid="SNP",ea="REF",nea="ALT",sep="\t")
gl1.fill_data(to_fill=["OR"])
gl2.fill_data(to_fill=["OR"])
# auto extract lead SNPs and compare the effect sizes
a = gl.compare_effect(path1= gl1,
path2= gl2,
mode="OR",
is_q=True,
anno=True,
anno_het=True,
q_level=0.05,
verbose=True
)
gl1 = gl.Sumstats("bbj_bmi_female.txt.gz",fmt="gwaslab",snpid="SNP",ea="REF",nea="ALT",sep="\t")
gl2 = gl.Sumstats("bbj_bmi_male.txt.gz",fmt="gwaslab",snpid="SNP",ea="REF",nea="ALT",sep="\t")
gl1.fill_data(to_fill=["OR"])
gl2.fill_data(to_fill=["OR"])
# auto extract lead SNPs and compare the effect sizes
a = gl.compare_effect(path1= gl1,
path2= gl2,
mode="OR",
is_q=True,
anno=True,
anno_het=True,
q_level=0.05,
verbose=True
)
Fri Jun 23 00:46:53 2023 GWASLab version 3.4.15 https://cloufield.github.io/gwaslab/ Fri Jun 23 00:46:53 2023 (C) 2022-2023, Yunye He, Kamatani Lab, MIT License, gwaslab@gmail.com Fri Jun 23 00:46:53 2023 Start to load format from formatbook.... Fri Jun 23 00:46:53 2023 -gwaslab format meta info: Fri Jun 23 00:46:53 2023 - format_name : gwaslab Fri Jun 23 00:46:53 2023 - format_source : https://cloufield.github.io/gwaslab/ Fri Jun 23 00:46:53 2023 - format_version : 20220729_v3 Fri Jun 23 00:46:53 2023 Start to initiate from file :bbj_bmi_female.txt.gz Fri Jun 23 00:46:59 2023 -Reading columns : CHR,BETA,ALT,POS,SNP,P,REF,SE Fri Jun 23 00:46:59 2023 -Renaming columns to : CHR,BETA,NEA,POS,SNPID,P,EA,SE Fri Jun 23 00:46:59 2023 -Current Dataframe shape : 5961600 x 8 Fri Jun 23 00:46:59 2023 -Initiating a status column: STATUS ... Fri Jun 23 00:47:01 2023 Start to reorder the columns... Fri Jun 23 00:47:01 2023 -Current Dataframe shape : 5961600 x 9 Fri Jun 23 00:47:01 2023 -Reordering columns to : SNPID,CHR,POS,EA,NEA,BETA,SE,P,STATUS Fri Jun 23 00:47:01 2023 Finished sorting columns successfully! Fri Jun 23 00:47:01 2023 Finished loading data successfully! Fri Jun 23 00:47:01 2023 GWASLab version 3.4.15 https://cloufield.github.io/gwaslab/ Fri Jun 23 00:47:01 2023 (C) 2022-2023, Yunye He, Kamatani Lab, MIT License, gwaslab@gmail.com Fri Jun 23 00:47:01 2023 Start to load format from formatbook.... Fri Jun 23 00:47:01 2023 -gwaslab format meta info: Fri Jun 23 00:47:01 2023 - format_name : gwaslab Fri Jun 23 00:47:01 2023 - format_source : https://cloufield.github.io/gwaslab/ Fri Jun 23 00:47:01 2023 - format_version : 20220729_v3 Fri Jun 23 00:47:01 2023 Start to initiate from file :bbj_bmi_male.txt.gz Fri Jun 23 00:47:07 2023 -Reading columns : CHR,BETA,ALT,POS,SNP,P,REF,SE Fri Jun 23 00:47:07 2023 -Renaming columns to : CHR,BETA,NEA,POS,SNPID,P,EA,SE Fri Jun 23 00:47:07 2023 -Current Dataframe shape : 5961600 x 8 Fri Jun 23 00:47:07 2023 -Initiating a status column: STATUS ... Fri Jun 23 00:47:08 2023 Start to reorder the columns... Fri Jun 23 00:47:08 2023 -Current Dataframe shape : 5961600 x 9 Fri Jun 23 00:47:08 2023 -Reordering columns to : SNPID,CHR,POS,EA,NEA,BETA,SE,P,STATUS Fri Jun 23 00:47:08 2023 Finished sorting columns successfully! Fri Jun 23 00:47:09 2023 Finished loading data successfully! Fri Jun 23 00:47:09 2023 Start filling data using existing columns... Fri Jun 23 00:47:09 2023 -Raw input columns: ['SNPID', 'CHR', 'POS', 'EA', 'NEA', 'BETA', 'SE', 'P', 'STATUS'] Fri Jun 23 00:47:09 2023 -Overwrite mode: False Fri Jun 23 00:47:09 2023 -Skipping columns: [] Fri Jun 23 00:47:09 2023 -Filling columns: ['OR'] Fri Jun 23 00:47:09 2023 - Filling Columns iteratively... Fri Jun 23 00:47:09 2023 - Filling OR using BETA column... Fri Jun 23 00:47:09 2023 - Filling OR_95L/OR_95U using BETA/SE columns... Fri Jun 23 00:47:09 2023 Start to reorder the columns... Fri Jun 23 00:47:09 2023 -Current Dataframe shape : 5961600 x 12 Fri Jun 23 00:47:09 2023 -Reordering columns to : SNPID,CHR,POS,EA,NEA,BETA,SE,P,OR,OR_95L,OR_95U,STATUS Fri Jun 23 00:47:09 2023 Finished sorting columns successfully! Fri Jun 23 00:47:09 2023 Finished filling data using existing columns. Fri Jun 23 00:47:09 2023 Start filling data using existing columns... Fri Jun 23 00:47:09 2023 -Raw input columns: ['SNPID', 'CHR', 'POS', 'EA', 'NEA', 'BETA', 'SE', 'P', 'STATUS'] Fri Jun 23 00:47:09 2023 -Overwrite mode: False Fri Jun 23 00:47:09 2023 -Skipping columns: [] Fri Jun 23 00:47:09 2023 -Filling columns: ['OR'] Fri Jun 23 00:47:09 2023 - Filling Columns iteratively... Fri Jun 23 00:47:09 2023 - Filling OR using BETA column... Fri Jun 23 00:47:09 2023 - Filling OR_95L/OR_95U using BETA/SE columns... Fri Jun 23 00:47:09 2023 Start to reorder the columns... Fri Jun 23 00:47:09 2023 -Current Dataframe shape : 5961600 x 12 Fri Jun 23 00:47:09 2023 -Reordering columns to : SNPID,CHR,POS,EA,NEA,BETA,SE,P,OR,OR_95L,OR_95U,STATUS Fri Jun 23 00:47:09 2023 Finished sorting columns successfully! Fri Jun 23 00:47:09 2023 Finished filling data using existing columns. Fri Jun 23 00:47:09 2023 Start to process the raw sumstats for plotting... Fri Jun 23 00:47:09 2023 Path1 is gwaslab Sumstats object... Fri Jun 23 00:47:09 2023 Path2 is gwaslab Sumstats object... Fri Jun 23 00:47:09 2023 -Loading Sumstats_2 SNP list in memory... Fri Jun 23 00:47:10 2023 -Loading sumstats for Sumstats_1: SNPID,P,CHR,POS Fri Jun 23 00:47:13 2023 -Counting variants available for both datasets: 5961600 variants... Fri Jun 23 00:47:17 2023 -Using only variants available for both datasets... Fri Jun 23 00:47:17 2023 -Extract lead variants from Sumstats_1... Fri Jun 23 00:47:17 2023 Start to extract lead variants... Fri Jun 23 00:47:17 2023 -Processing 5961600 variants... Fri Jun 23 00:47:17 2023 -Significance threshold : 5e-08 Fri Jun 23 00:47:17 2023 -Sliding window size: 500 kb Fri Jun 23 00:47:19 2023 -Found 948 significant variants in total... Fri Jun 23 00:47:19 2023 -Identified 20 lead variants! Fri Jun 23 00:47:19 2023 Finished extracting lead variants successfully! Fri Jun 23 00:47:20 2023 -Loading sumstats for Sumstats_2: SNPID,P,CHR,POS Fri Jun 23 00:47:25 2023 -Extract lead snps from Sumstats_2... Fri Jun 23 00:47:25 2023 Start to extract lead variants... Fri Jun 23 00:47:25 2023 -Processing 5961600 variants... Fri Jun 23 00:47:25 2023 -Significance threshold : 5e-08 Fri Jun 23 00:47:25 2023 -Sliding window size: 500 kb Fri Jun 23 00:47:28 2023 -Found 1937 significant variants in total... Fri Jun 23 00:47:28 2023 -Identified 28 lead variants! Fri Jun 23 00:47:28 2023 Finished extracting lead variants successfully! Fri Jun 23 00:47:28 2023 Merging snps from Sumstats_1 and Sumstats_2... Fri Jun 23 00:47:28 2023 -Extract statistics of selected variants from Sumstats_1 : SNPID,P,EA,NEA,OR,OR_95L,OR_95U Fri Jun 23 00:47:28 2023 -Merging Sumstats_1 effect information... Fri Jun 23 00:47:31 2023 -Extract statistics of selected variants from Sumstats_2 : SNPID,P,EA,NEA,OR,OR_95L,OR_95U Fri Jun 23 00:47:32 2023 -Merging Sumstats_2 effect information... Fri Jun 23 00:47:34 2023 -Updating missing information for Sumstats_1 ... Fri Jun 23 00:47:36 2023 -Updating missing information for Sumstats_2 ... Fri Jun 23 00:47:38 2023 -Assigning indicator ... Fri Jun 23 00:47:38 2023 -Aligning Sumstats_2 EA with Sumstats_1 EA ... Fri Jun 23 00:47:38 2023 -Aligned all EAs in Sumstats_2 with EAs in Sumstats_1 ... Fri Jun 23 00:47:38 2023 -Calculating Cochran's Q statistics and peform chisq test... Fri Jun 23 00:47:38 2023 -Saving the merged data to: Sumstats_1_Sumstats_2_beta_sig_list_merged.tsv Fri Jun 23 00:47:38 2023 -Significant het: 20 Fri Jun 23 00:47:38 2023 -All sig: 46 Fri Jun 23 00:47:38 2023 -Het rate: 0.43478260869565216 Fri Jun 23 00:47:38 2023 -Identified 0 variants which are not significant in None. Fri Jun 23 00:47:38 2023 -Identified 10 variants which are only significant in Sumstats_1. Fri Jun 23 00:47:38 2023 -Identified 18 variants which are only significant in Sumstats_2. Fri Jun 23 00:47:38 2023 -Identified 18 variants which are significant in Both. Fri Jun 23 00:47:38 2023 Creating the scatter plot for effect sizes comparison... Fri Jun 23 00:47:41 2023 -Calculating p values based on given null slope : 0 Fri Jun 23 00:47:41 2023 -Beta_se = 0.06906387999309058 Fri Jun 23 00:47:41 2023 -H0 beta = 0 , default p = 4.70e-18
Specify the snps you want to check¶
In [6]:
Copied!
#randomly extract 15 snps for comparison
# zcat bbj_bmi_female.txt.gz | shuf -n 25 | cut -f 1
#randomly extract 15 snps for comparison
# zcat bbj_bmi_female.txt.gz | shuf -n 25 | cut -f 1
In [7]:
Copied!
# store them in a list
snps = [
"rs1863794",
"rs78031431",
"rs2065160",
"rs79873757",
"rs76557076",
"rs75763355",
"rs73465683",
"rs7259784",
"rs12752457",
"rs1984969",
"rs4598662",
"rs36013795",
"rs7411446",
"rs3785568",
"rs13850768",
"rs77247065",
"rs7718738",
"rs74800130",
"rs79767381",
"rs6544774",
"rs245902",
"rs7994403",
"rs79658166",
"rs6822158",
"rs73442063"
]
# store them in a list
snps = [
"rs1863794",
"rs78031431",
"rs2065160",
"rs79873757",
"rs76557076",
"rs75763355",
"rs73465683",
"rs7259784",
"rs12752457",
"rs1984969",
"rs4598662",
"rs36013795",
"rs7411446",
"rs3785568",
"rs13850768",
"rs77247065",
"rs7718738",
"rs74800130",
"rs79767381",
"rs6544774",
"rs245902",
"rs7994403",
"rs79658166",
"rs6822158",
"rs73442063"
]
Creating comparison plot using beta and se of specified variants¶
In [8]:
Copied!
# SNP CHR POS A1 A2 A1Frq Rsq BETA SE P
a = gl.compare_effect(path1="bbj_bmi_female.txt.gz",
cols_name_list_1=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1=["BETA","SE"],
path2="bbj_bmi_male.txt.gz",
cols_name_list_2=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2=["BETA","SE"],
label=["Female","Male","Both","None"],
xylabel_prefix="Per-allele effect size for ",
sig_level=0.1,
legend_title=r'$ P < 0.1$ in:',
snplist=snps,
verbose=True
)
# SNP CHR POS A1 A2 A1Frq Rsq BETA SE P
a = gl.compare_effect(path1="bbj_bmi_female.txt.gz",
cols_name_list_1=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1=["BETA","SE"],
path2="bbj_bmi_male.txt.gz",
cols_name_list_2=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2=["BETA","SE"],
label=["Female","Male","Both","None"],
xylabel_prefix="Per-allele effect size for ",
sig_level=0.1,
legend_title=r'$ P < 0.1$ in:',
snplist=snps,
verbose=True
)
Fri Jun 23 00:47:42 2023 Start to process the raw sumstats for plotting... Fri Jun 23 00:47:42 2023 -Loading Male SNP list in memory... Fri Jun 23 00:47:47 2023 -Loading sumstats for Female: SNP,P Fri Jun 23 00:47:54 2023 -Counting variants available for both datasets: 5961600 variants... Fri Jun 23 00:47:58 2023 -Using only variants available for both datasets... Fri Jun 23 00:47:58 2023 -Extract variants in the given list from Female... Fri Jun 23 00:47:58 2023 -Loading sumstats for Male: SNP,P Fri Jun 23 00:48:08 2023 -Extract snps in the given list from Male... Fri Jun 23 00:48:09 2023 Merging snps from Female and Male... Fri Jun 23 00:48:09 2023 -Extract statistics of selected variants from Female : SNP,P,REF,ALT,BETA,SE Fri Jun 23 00:48:14 2023 -Merging Female effect information... Fri Jun 23 00:48:17 2023 -Extract statistics of selected variants from Male : SNP,P,REF,ALT,BETA,SE Fri Jun 23 00:48:23 2023 -Merging Male effect information... Fri Jun 23 00:48:26 2023 -Updating missing information for Female ... Fri Jun 23 00:48:33 2023 -Updating missing information for Male ... Fri Jun 23 00:48:40 2023 -Assigning indicator ... Fri Jun 23 00:48:40 2023 -Aligning Male EA with Female EA ... Fri Jun 23 00:48:40 2023 -Aligned all EAs in Male with EAs in Female ... Fri Jun 23 00:48:40 2023 -Saving the merged data to: Female_Male_beta_sig_list_merged.tsv Fri Jun 23 00:48:40 2023 -Identified 17 variants which are not significant in None. Fri Jun 23 00:48:40 2023 -Identified 4 variants which are only significant in Female. Fri Jun 23 00:48:40 2023 -Identified 3 variants which are only significant in Male. Fri Jun 23 00:48:40 2023 -Identified 0 variants which are significant in Both. Fri Jun 23 00:48:40 2023 Creating the scatter plot for effect sizes comparison... Fri Jun 23 00:48:40 2023 -Calculating p values based on given null slope : 0 Fri Jun 23 00:48:40 2023 -Beta_se = 0.3611078876446907 Fri Jun 23 00:48:40 2023 -H0 beta = 0 , default p = 7.21e-01
Check raw data¶
In [9]:
Copied!
#data
a[0]
#data
a[0]
Out[9]:
| P_1 | P_2 | EA_1 | NEA_1 | EFFECT_1 | SE_1 | EA_2 | NEA_2 | EFFECT_2 | SE_2 | indicator | EA_2_aligned | NEA_2_aligned | EFFECT_2_aligned | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SNPID | ||||||||||||||
| rs12752457 | 0.04261 | 0.53740 | G | A | 0.017860 | 0.008810 | G | A | 0.004928 | 0.007991 | 1 | G | A | 0.004928 |
| rs1863794 | 0.38560 | 0.40490 | C | T | 0.004661 | 0.005372 | C | T | -0.004097 | 0.004919 | 0 | C | T | -0.004097 |
| rs1984969 | 0.22880 | 0.90250 | A | G | -0.008401 | 0.006981 | A | G | 0.000782 | 0.006386 | 0 | A | G | 0.000782 |
| rs2065160 | 0.21280 | 0.96720 | A | G | -0.007738 | 0.006211 | A | G | 0.000236 | 0.005732 | 0 | A | G | 0.000236 |
| rs245902 | 0.10370 | 0.41980 | T | A | 0.009501 | 0.005838 | T | A | 0.004318 | 0.005352 | 0 | T | A | 0.004318 |
| rs36013795 | 0.19560 | 0.47450 | C | T | -0.007498 | 0.005794 | C | T | 0.003788 | 0.005296 | 0 | C | T | 0.003788 |
| rs3785568 | 0.61490 | 0.93730 | T | C | -0.004761 | 0.009462 | T | C | -0.000689 | 0.008758 | 0 | T | C | -0.000689 |
| rs4598662 | 0.30970 | 0.89010 | A | G | -0.007636 | 0.007517 | A | G | 0.000945 | 0.006843 | 0 | A | G | 0.000945 |
| rs6544774 | 0.27650 | 0.46020 | G | A | 0.012160 | 0.011170 | G | A | -0.007611 | 0.010310 | 0 | G | A | -0.007611 |
| rs6822158 | 0.97080 | 0.04674 | C | T | 0.001801 | 0.049120 | C | T | 0.085530 | 0.043010 | 2 | C | T | 0.085530 |
| rs7259784 | 0.22590 | 0.39740 | A | G | 0.008678 | 0.007166 | A | G | 0.005579 | 0.006592 | 0 | A | G | 0.005579 |
| rs73442063 | 0.47200 | 0.07833 | A | G | 0.004324 | 0.006012 | A | G | 0.009727 | 0.005525 | 2 | A | G | 0.009727 |
| rs73465683 | 0.77150 | 0.97320 | T | C | -0.003489 | 0.012020 | T | C | -0.000374 | 0.011100 | 0 | T | C | -0.000374 |
| rs7411446 | 0.59270 | 0.58580 | C | T | 0.005894 | 0.011020 | C | T | -0.005507 | 0.010110 | 0 | C | T | -0.005507 |
| rs74800130 | 0.87420 | 0.77590 | G | A | -0.001590 | 0.010040 | G | A | 0.002642 | 0.009282 | 0 | G | A | 0.002642 |
| rs75763355 | 0.47580 | 0.94090 | T | C | -0.006767 | 0.009489 | T | C | -0.000645 | 0.008712 | 0 | T | C | -0.000645 |
| rs76557076 | 0.77800 | 0.51810 | A | G | 0.002150 | 0.007627 | A | G | 0.004468 | 0.006914 | 0 | A | G | 0.004468 |
| rs7718738 | 0.07710 | 0.53430 | G | A | -0.011250 | 0.006364 | G | A | -0.003617 | 0.005821 | 1 | G | A | -0.003617 |
| rs77247065 | 0.09607 | 0.89220 | C | T | -0.032720 | 0.019660 | C | T | -0.002492 | 0.018380 | 1 | C | T | -0.002492 |
| rs78031431 | 0.35110 | 0.45570 | G | A | 0.010760 | 0.011540 | G | A | -0.007951 | 0.010660 | 0 | G | A | -0.007951 |
| rs79658166 | 0.70730 | 0.97160 | A | G | 0.004161 | 0.011080 | A | G | 0.000364 | 0.010220 | 0 | A | G | 0.000364 |
| rs79767381 | 0.26660 | 0.39850 | C | G | -0.010760 | 0.009690 | C | G | 0.007546 | 0.008937 | 0 | C | G | 0.007546 |
| rs79873757 | 0.04708 | 0.11610 | G | A | -0.014100 | 0.007103 | G | A | -0.010350 | 0.006585 | 1 | G | A | -0.010350 |
| rs7994403 | 0.47960 | 0.09429 | G | T | -0.008070 | 0.011420 | G | T | 0.017470 | 0.010440 | 2 | G | T | 0.017470 |
In [10]:
Copied!
#figure
a[1]
#figure
a[1]
Out[10]:
In [11]:
Copied!
#log
a[2]
#log
a[2]
Out[11]:
<gwaslab.Log.Log at 0x7f1fce17b490>
heterogeneity test¶
In [12]:
Copied!
pd1 = pd.read_table("bbj_bmi_female.txt.gz",sep="\t")
pd2 = pd.read_table("bbj_bmi_male.txt.gz",sep="\t")
# cols_name_list should be SNPID, P, Effect Allele, Non-Effect allele, Chromosome and Position
# effect_cols_list should be BETA,SE
a = gl.compare_effect(path1 = pd1,
cols_name_list_1 = ["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1= ["BETA","SE"],
path2 = pd2,
cols_name_list_2 = ["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2= ["BETA","SE"],
is_q=True,
q_level=0.05,
legend_mode="full"
)
pd1 = pd.read_table("bbj_bmi_female.txt.gz",sep="\t")
pd2 = pd.read_table("bbj_bmi_male.txt.gz",sep="\t")
# cols_name_list should be SNPID, P, Effect Allele, Non-Effect allele, Chromosome and Position
# effect_cols_list should be BETA,SE
a = gl.compare_effect(path1 = pd1,
cols_name_list_1 = ["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1= ["BETA","SE"],
path2 = pd2,
cols_name_list_2 = ["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2= ["BETA","SE"],
is_q=True,
q_level=0.05,
legend_mode="full"
)
In [13]:
Copied!
# r_se=True : estimate the se for R using jackknife method
a = gl.compare_effect(path1="bbj_bmi_female.txt.gz",
cols_name_list_1=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1=["BETA","SE"],
path2="bbj_bmi_male.txt.gz",
cols_name_list_2=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2=["BETA","SE"],
label=["Female","Male","Both","None"],
xylabel_prefix="Per-allele effect size for ",
anno=True,
anno_diff=0.015,
is_q=True,
q_level=0.05,
anno_het=True,
sig_level=5e-8,
legend_title=r'$ P < 5 x 10^{-8}$ in:',
verbose=True
)
# r_se=True : estimate the se for R using jackknife method
a = gl.compare_effect(path1="bbj_bmi_female.txt.gz",
cols_name_list_1=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1=["BETA","SE"],
path2="bbj_bmi_male.txt.gz",
cols_name_list_2=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2=["BETA","SE"],
label=["Female","Male","Both","None"],
xylabel_prefix="Per-allele effect size for ",
anno=True,
anno_diff=0.015,
is_q=True,
q_level=0.05,
anno_het=True,
sig_level=5e-8,
legend_title=r'$ P < 5 x 10^{-8}$ in:',
verbose=True
)
Fri Jun 23 00:49:21 2023 Start to process the raw sumstats for plotting... Fri Jun 23 00:49:21 2023 -Loading Male SNP list in memory... Fri Jun 23 00:49:26 2023 -Loading sumstats for Female: SNP,P,CHR,POS Fri Jun 23 00:49:33 2023 -Counting variants available for both datasets: 5961600 variants... Fri Jun 23 00:49:37 2023 -Using only variants available for both datasets... Fri Jun 23 00:49:37 2023 -Extract lead variants from Female... Fri Jun 23 00:49:37 2023 Start to extract lead variants... Fri Jun 23 00:49:37 2023 -Processing 5961600 variants... Fri Jun 23 00:49:37 2023 -Significance threshold : 5e-08 Fri Jun 23 00:49:37 2023 -Sliding window size: 500 kb Fri Jun 23 00:49:38 2023 -Found 948 significant variants in total... Fri Jun 23 00:49:38 2023 -Identified 20 lead variants! Fri Jun 23 00:49:38 2023 Finished extracting lead variants successfully! Fri Jun 23 00:49:38 2023 -Loading sumstats for Male: SNP,P,CHR,POS Fri Jun 23 00:49:49 2023 -Extract lead snps from Male... Fri Jun 23 00:49:49 2023 Start to extract lead variants... Fri Jun 23 00:49:49 2023 -Processing 5961600 variants... Fri Jun 23 00:49:49 2023 -Significance threshold : 5e-08 Fri Jun 23 00:49:49 2023 -Sliding window size: 500 kb Fri Jun 23 00:49:50 2023 -Found 1937 significant variants in total... Fri Jun 23 00:49:50 2023 -Identified 28 lead variants! Fri Jun 23 00:49:50 2023 Finished extracting lead variants successfully! Fri Jun 23 00:49:50 2023 Merging snps from Female and Male... Fri Jun 23 00:49:50 2023 -Extract statistics of selected variants from Female : SNP,P,REF,ALT,BETA,SE Fri Jun 23 00:49:56 2023 -Merging Female effect information... Fri Jun 23 00:49:59 2023 -Extract statistics of selected variants from Male : SNP,P,REF,ALT,BETA,SE Fri Jun 23 00:50:05 2023 -Merging Male effect information... Fri Jun 23 00:50:07 2023 -Updating missing information for Female ... Fri Jun 23 00:50:14 2023 -Updating missing information for Male ... Fri Jun 23 00:50:21 2023 -Assigning indicator ... Fri Jun 23 00:50:21 2023 -Aligning Male EA with Female EA ... Fri Jun 23 00:50:21 2023 -Aligned all EAs in Male with EAs in Female ... Fri Jun 23 00:50:21 2023 -Calculating Cochran's Q statistics and peform chisq test... Fri Jun 23 00:50:21 2023 -Saving the merged data to: Female_Male_beta_sig_list_merged.tsv Fri Jun 23 00:50:21 2023 -Significant het: 20 Fri Jun 23 00:50:21 2023 -All sig: 46 Fri Jun 23 00:50:21 2023 -Het rate: 0.43478260869565216 Fri Jun 23 00:50:21 2023 -Identified 0 variants which are not significant in None. Fri Jun 23 00:50:21 2023 -Identified 10 variants which are only significant in Female. Fri Jun 23 00:50:21 2023 -Identified 18 variants which are only significant in Male. Fri Jun 23 00:50:21 2023 -Identified 18 variants which are significant in Both. Fri Jun 23 00:50:21 2023 Creating the scatter plot for effect sizes comparison... Fri Jun 23 00:50:24 2023 -Calculating p values based on given null slope : 0 Fri Jun 23 00:50:24 2023 -Beta_se = 0.06938548104838471 Fri Jun 23 00:50:24 2023 -H0 beta = 0 , default p = 4.75e-18
annotate¶
In [14]:
Copied!
# r2_se=True : estimate the se for R2 using jackknife method
a = gl.compare_effect(path1="bbj_bmi_female.txt.gz",
cols_name_list_1=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1=["BETA","SE"],
path2="bbj_bmi_male.txt.gz",
cols_name_list_2=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2=["BETA","SE"],
label=["Female","Male","Both","None"],
xylabel_prefix="Per-allele effect size for ",
r_se=True,
is_q=True,
anno=True,
anno_het=True,
anno_diff=0.02,
sig_level=5e-8,
legend_title=r'$ P < 5 x 10^{-8}$ in:',
verbose=True
)
# r2_se=True : estimate the se for R2 using jackknife method
a = gl.compare_effect(path1="bbj_bmi_female.txt.gz",
cols_name_list_1=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1=["BETA","SE"],
path2="bbj_bmi_male.txt.gz",
cols_name_list_2=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2=["BETA","SE"],
label=["Female","Male","Both","None"],
xylabel_prefix="Per-allele effect size for ",
r_se=True,
is_q=True,
anno=True,
anno_het=True,
anno_diff=0.02,
sig_level=5e-8,
legend_title=r'$ P < 5 x 10^{-8}$ in:',
verbose=True
)
Fri Jun 23 00:50:25 2023 Start to process the raw sumstats for plotting... Fri Jun 23 00:50:25 2023 -Loading Male SNP list in memory... Fri Jun 23 00:50:31 2023 -Loading sumstats for Female: SNP,P,CHR,POS Fri Jun 23 00:50:37 2023 -Counting variants available for both datasets: 5961600 variants... Fri Jun 23 00:50:41 2023 -Using only variants available for both datasets... Fri Jun 23 00:50:41 2023 -Extract lead variants from Female... Fri Jun 23 00:50:41 2023 Start to extract lead variants... Fri Jun 23 00:50:41 2023 -Processing 5961600 variants... Fri Jun 23 00:50:41 2023 -Significance threshold : 5e-08 Fri Jun 23 00:50:41 2023 -Sliding window size: 500 kb Fri Jun 23 00:50:42 2023 -Found 948 significant variants in total... Fri Jun 23 00:50:42 2023 -Identified 20 lead variants! Fri Jun 23 00:50:42 2023 Finished extracting lead variants successfully! Fri Jun 23 00:50:43 2023 -Loading sumstats for Male: SNP,P,CHR,POS Fri Jun 23 00:50:53 2023 -Extract lead snps from Male... Fri Jun 23 00:50:53 2023 Start to extract lead variants... Fri Jun 23 00:50:53 2023 -Processing 5961600 variants... Fri Jun 23 00:50:53 2023 -Significance threshold : 5e-08 Fri Jun 23 00:50:53 2023 -Sliding window size: 500 kb Fri Jun 23 00:50:54 2023 -Found 1937 significant variants in total... Fri Jun 23 00:50:54 2023 -Identified 28 lead variants! Fri Jun 23 00:50:54 2023 Finished extracting lead variants successfully! Fri Jun 23 00:50:55 2023 Merging snps from Female and Male... Fri Jun 23 00:50:55 2023 -Extract statistics of selected variants from Female : SNP,P,REF,ALT,BETA,SE Fri Jun 23 00:51:00 2023 -Merging Female effect information... Fri Jun 23 00:51:03 2023 -Extract statistics of selected variants from Male : SNP,P,REF,ALT,BETA,SE Fri Jun 23 00:51:09 2023 -Merging Male effect information... Fri Jun 23 00:51:12 2023 -Updating missing information for Female ... Fri Jun 23 00:51:19 2023 -Updating missing information for Male ... Fri Jun 23 00:51:26 2023 -Assigning indicator ... Fri Jun 23 00:51:26 2023 -Aligning Male EA with Female EA ... Fri Jun 23 00:51:26 2023 -Aligned all EAs in Male with EAs in Female ... Fri Jun 23 00:51:26 2023 -Calculating Cochran's Q statistics and peform chisq test... Fri Jun 23 00:51:26 2023 -Saving the merged data to: Female_Male_beta_sig_list_merged.tsv Fri Jun 23 00:51:26 2023 -Significant het: 20 Fri Jun 23 00:51:26 2023 -All sig: 46 Fri Jun 23 00:51:26 2023 -Het rate: 0.43478260869565216 Fri Jun 23 00:51:26 2023 -Identified 0 variants which are not significant in None. Fri Jun 23 00:51:26 2023 -Identified 10 variants which are only significant in Female. Fri Jun 23 00:51:26 2023 -Identified 18 variants which are only significant in Male. Fri Jun 23 00:51:26 2023 -Identified 18 variants which are significant in Both. Fri Jun 23 00:51:26 2023 Creating the scatter plot for effect sizes comparison... Fri Jun 23 00:51:27 2023 -Estimating SE for rsq using Jackknife method. Fri Jun 23 00:51:27 2023 -Calculating p values based on given null slope : 0 Fri Jun 23 00:51:27 2023 -Beta_se = 0.06938548104838471 Fri Jun 23 00:51:27 2023 -H0 beta = 0 , default p = 4.75e-18 Fri Jun 23 00:51:27 2023 -R se (jackknife) = 2.23e-02
save¶
In [15]:
Copied!
a = gl.compare_effect(path1="bbj_bmi_female.txt.gz",
cols_name_list_1=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1=["BETA","SE"],
path2="bbj_bmi_male.txt.gz",
cols_name_list_2=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2=["BETA","SE"],
label=["Female","Male","Both","None"],
xylabel_prefix="Per-allele effect size for ",
anno=True,
anno_het=True,
anno_diff=0.015,
allele_match=True,
sig_level=5e-8,
legend_title=r'$ P < 5 x 10^{-8}$ in:',
verbose=True,
save = "myplot.png",
save_args= {"dpi":300,"facecolor":"white"}
)
a = gl.compare_effect(path1="bbj_bmi_female.txt.gz",
cols_name_list_1=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_1=["BETA","SE"],
path2="bbj_bmi_male.txt.gz",
cols_name_list_2=["SNP","P","REF","ALT","CHR","POS"],effect_cols_list_2=["BETA","SE"],
label=["Female","Male","Both","None"],
xylabel_prefix="Per-allele effect size for ",
anno=True,
anno_het=True,
anno_diff=0.015,
allele_match=True,
sig_level=5e-8,
legend_title=r'$ P < 5 x 10^{-8}$ in:',
verbose=True,
save = "myplot.png",
save_args= {"dpi":300,"facecolor":"white"}
)
Fri Jun 23 00:51:28 2023 Start to process the raw sumstats for plotting... Fri Jun 23 00:51:28 2023 -Loading Male SNP list in memory... Fri Jun 23 00:51:34 2023 -Loading sumstats for Female: SNP,P,CHR,POS Fri Jun 23 00:51:40 2023 -Counting variants available for both datasets: 5961600 variants... Fri Jun 23 00:51:44 2023 -Using only variants available for both datasets... Fri Jun 23 00:51:44 2023 -Extract lead variants from Female... Fri Jun 23 00:51:44 2023 Start to extract lead variants... Fri Jun 23 00:51:44 2023 -Processing 5961600 variants... Fri Jun 23 00:51:44 2023 -Significance threshold : 5e-08 Fri Jun 23 00:51:44 2023 -Sliding window size: 500 kb Fri Jun 23 00:51:45 2023 -Found 948 significant variants in total... Fri Jun 23 00:51:45 2023 -Identified 20 lead variants! Fri Jun 23 00:51:45 2023 Finished extracting lead variants successfully! Fri Jun 23 00:51:46 2023 -Loading sumstats for Male: SNP,P,CHR,POS Fri Jun 23 00:51:56 2023 -Extract lead snps from Male... Fri Jun 23 00:51:56 2023 Start to extract lead variants... Fri Jun 23 00:51:56 2023 -Processing 5961600 variants... Fri Jun 23 00:51:56 2023 -Significance threshold : 5e-08 Fri Jun 23 00:51:56 2023 -Sliding window size: 500 kb Fri Jun 23 00:51:57 2023 -Found 1937 significant variants in total... Fri Jun 23 00:51:57 2023 -Identified 28 lead variants! Fri Jun 23 00:51:57 2023 Finished extracting lead variants successfully! Fri Jun 23 00:51:58 2023 Merging snps from Female and Male... Fri Jun 23 00:51:58 2023 -Extract statistics of selected variants from Female : SNP,P,REF,ALT,BETA,SE Fri Jun 23 00:52:03 2023 -Merging Female effect information... Fri Jun 23 00:52:06 2023 -Extract statistics of selected variants from Male : SNP,P,REF,ALT,BETA,SE Fri Jun 23 00:52:12 2023 -Merging Male effect information... Fri Jun 23 00:52:15 2023 -Updating missing information for Female ... Fri Jun 23 00:52:22 2023 -Updating missing information for Male ... Fri Jun 23 00:52:29 2023 -Assigning indicator ... Fri Jun 23 00:52:29 2023 -Aligning Male EA with Female EA ... Fri Jun 23 00:52:29 2023 -Aligned all EAs in Male with EAs in Female ... Fri Jun 23 00:52:29 2023 -No variants with EA not matching... Fri Jun 23 00:52:29 2023 -Calculating Cochran's Q statistics and peform chisq test... Fri Jun 23 00:52:29 2023 -Saving the merged data to: Female_Male_beta_sig_list_merged.tsv Fri Jun 23 00:52:29 2023 -Significant het: 20 Fri Jun 23 00:52:29 2023 -All sig: 46 Fri Jun 23 00:52:29 2023 -Het rate: 0.43478260869565216 Fri Jun 23 00:52:29 2023 -Identified 0 variants which are not significant in None. Fri Jun 23 00:52:29 2023 -Identified 10 variants which are only significant in Female. Fri Jun 23 00:52:29 2023 -Identified 18 variants which are only significant in Male. Fri Jun 23 00:52:29 2023 -Identified 18 variants which are significant in Both. Fri Jun 23 00:52:29 2023 Creating the scatter plot for effect sizes comparison... Fri Jun 23 00:52:31 2023 -Calculating p values based on given null slope : 0 Fri Jun 23 00:52:31 2023 -Beta_se = 0.06938548104838471 Fri Jun 23 00:52:31 2023 -H0 beta = 0 , default p = 4.75e-18 Fri Jun 23 00:52:32 2023 Saving plot: Fri Jun 23 00:52:32 2023 -Saved to myplot.png successfully!